ABSTRACT

In this paper, we present a Case Based Reasoning (CBR)
system for the retrieval of medical cases made up of a series
of images with semantic information (such as the patient’s
age, sex and medical history). Indeed, medical experts
generally need varied sources of information, which might be
incomplete, uncertain and conflicting, to diagnose a pathology.
Consequently, we derive a retrieval framework from Bayesian
networks and the Dezert-Smarandache theory, which are well
suited to handle those problems. The system is designed so
that heterogeneous sources of information can be integrated:
in particular images, indexed by their digital content, and
symbolic information. The method is evaluated on a classified
diabetic retinopathy database. On this database, results are
promising: the retrieval precision at five reaches 80.5%,
which is almost twice as good as the retrieval of single
images alone.

Index Terms — Case based reasoning, Image indexing, 
Bayesian networks, Dezert-Smarandache theory, Diabetic 
Retinopathy 


1. INTRODUCTION 

In medicine, the knowledge of experts is a mixture of textbook
knowledge and experience through real life clinical cases.
Consequently, there is a growing interest in case-based
reasoning (CBR), introduced in the early 1980s, for the
development of medical decision support systems [1]. The
underlying idea of CBR is the assumption that analogous
problems have similar solutions, an idea backed up by
physicians’ experience. In CBR, the basic process of
interpreting a new situation revolves around the retrieval of
relevant cases in a case database. The retrieved cases are then
used to help interpret the new one.

We propose in this article a CBR system for the retrieval of
medical cases made up of a series of images with contextual
information. The proposed system is applied to the diagnosis
of Diabetic Retinopathy (DR). Indeed, to diagnose DR,
physicians analyze series of multimodal photographs together
with contextual information like the patient’s age, sex and
medical history.

When designing a CBR system to retrieve such cases, several
problems arise. We have to aggregate heterogeneous sources
of evidence (images, nominal and continuous variables) and
to manage missing information. To solve these problems,
we propose to express the different sources of information as
probabilities and to model the relationships between the
attributes with a Bayesian network. The Bayesian network may
be used to fuse the sources of information. However, these
sources may be uncertain and conflicting. As a consequence,
we also applied the Dezert-Smarandache Theory (DSmT)
of plausible and paradoxical reasoning, proposed in recent
years [2], which is better suited than the Bayesian approach
to fuse uncertain, highly conflicting and imprecise sources of
evidence.

2. DIABETIC RETINOPATHY DATABASE 






Fig. 1. Photograph series of a patient’s eye
Images (a), (b) and (c) are photographs obtained by applying
different color filters. Images (d) to (j) form a temporal
angiographic series: a contrast product is injected and
photographs are taken at different stages (early (d),
intermediate (e)-(i) and late (j)).


Diabetes is a metabolic disorder characterized by sustained
inappropriately high blood sugar levels. This progressively
affects blood vessels in many organs, including the retina,
which may lead to blindness. The database is made up of 63
patient files containing 1045 photographs altogether.
Patients have been recruited at Brest University Hospital
since June 2003 and images were acquired by experts using
a Topcon Retinal Digital Camera (TRC-50IA) connected to
a computer. Images have a definition of 1280 pixels/line for
1008 lines/image. The contextual information available is
the patients’ age and sex and structured medical information
(about the general clinical context, the diabetes context, eye
symptoms and maculopathy). Thus, at most, patient records
are made up of 10 images per eye (see figure 1) and of 13
contextual attributes; 12.1% of these images and 40.5% of
these contextual attribute values are missing. The disease
severity level, according to ICDRS classification [3], was
determined by experts for each patient.

3. BAYESIAN NETWORKS AND THE 
DEZERT-SMARANDACHE THEORY 

3.1. Bayesian Networks 

A Bayesian network [4] is a probabilistic graphical model
that represents a set of variables and their probabilistic
dependencies. It is a directed acyclic graph whose nodes
represent variables, and whose arcs encode conditional
independencies between the variables. Each arc in the graph is
associated with a conditional probability matrix expressing
the probability of a child variable given one of its
parent variables. A directed acyclic graph is a Bayesian
network relative to a set of variables $\{X_1, \ldots, X_n\}$ if
the joint distribution $P(X_1, \ldots, X_n)$ can be expressed as
follows:
$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{parents}(X_i))$.
The network structure and conditional probability tables can be
learnt automatically from data [5].

A Bayesian network is used to answer probabilistic queries 
about the variables; typically to find out updated knowledge 
of the state of a subset of variables when other variables (the 
evidence variables) are observed. This process of computing 
the posterior distribution of variables given evidence is called 
probabilistic inference. It can be used to fuse evidence from 
several sources of information. 
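
As an illustration of the factorization and of probabilistic inference, here is a minimal sketch on a toy three-node network. The variable names (`C`, `F1`, `F2`), the binary states and all probability values are assumptions made up for this example; they are not taken from the retinopathy database. The posterior is obtained by brute-force enumeration, which is only practical for very small networks.

```python
from itertools import product

# Toy network (hypothetical): C -> F1, C -> F2, all variables binary.
# Each entry: variable -> (parents, CPT), where the CPT maps a tuple of
# parent values to the distribution [P(var=0), P(var=1)].
network = {
    "C": ((), {(): [0.6, 0.4]}),
    "F1": (("C",), {(0,): [0.9, 0.1], (1,): [0.2, 0.8]}),
    "F2": (("C",), {(0,): [0.7, 0.3], (1,): [0.4, 0.6]}),
}


def joint(assignment):
    """P(X_1, ..., X_n) as the product of P(X_i | parents(X_i))."""
    p = 1.0
    for var, (parents, cpt) in network.items():
        key = tuple(assignment[q] for q in parents)
        p *= cpt[key][assignment[var]]
    return p


def posterior(query, evidence):
    """P(query | evidence) by enumeration; unobserved variables are summed out."""
    names = list(network)
    scores = [0.0, 0.0]
    for values in product((0, 1), repeat=len(names)):
        a = dict(zip(names, values))
        if all(a[k] == v for k, v in evidence.items()):
            scores[a[query]] += joint(a)
    total = sum(scores)
    return [s / total for s in scores]


# F2 is missing in the first query: it is simply marginalized.
print(posterior("C", {"F1": 1}))
print(posterior("C", {"F1": 1, "F2": 1}))
```

The first query shows how a missing attribute is handled without any special treatment, which is the property the retrieval system relies on.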

3.2. Dezert-Smarandache Theory 

The Dezert-Smarandache Theory allows combining any type
of independent sources of information represented in terms of
belief functions. It is more general than probabilistic (or
Bayesian) fusion, discussed above, or Dempster-Shafer theory.
It is particularly well suited to fuse uncertain, highly
conflicting and imprecise sources of evidence [2].

Let $\Theta = \{\theta_1, \theta_2, \ldots\}$ be a set of hypotheses under
consideration for the fusion problem; $\Theta$ is called the frame
of discernment. In Bayesian theory, a probability $p(\theta_i)$ is
assigned to each element $\theta_i$ of the frame, such that
$\sum_{\theta_i \in \Theta} p(\theta_i) = 1$. More generally, in DSmT, a
belief mass $m(A)$ is assigned to each element $A$ of the
hyper-power set $D(\Theta)$, i.e. the set of all composite
propositions built from elements of $\Theta$ with $\cap$ and $\cup$
operators, such that $m(\emptyset) = 0$ and
$\sum_{A \in D(\Theta)} m(A) = 1$.
The belief mass functions specified by the user for each
source of information, noted $m_j$, $j = 1..N$, are fused into
the global mass function $m_f$, according to a given rule of
combination. Several rules have been proposed to combine
mass functions, including the hybrid rule of combination or
the PCR (Proportional Conflict Redistribution) rules [2]. It
is possible to introduce constraints in the model [2]: we can
specify pairs of incompatible hypotheses $(\theta_a, \theta_b)$, i.e.
each subset $A$ of $\theta_a \cap \theta_b$ must have a null mass, noted
$A \in C(\Theta)$.
Once the fused mass function $m_f$ has been computed, a
decision function is used to evaluate the probability of each
hypothesis; one of these functions can be used: the credibility,
the plausibility or the pignistic probability [2].
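
As a small worked illustration of these definitions, consider a frame with only two hypotheses. This two-element frame is chosen here for illustration; it is a sketch of how the hyper-power set and the constraint interact, not an equation taken from the text above.

```latex
% Frame with two hypotheses and its hyper-power set (5 elements)
\Theta = \{\theta_1, \theta_2\}
\quad\Longrightarrow\quad
D(\Theta) = \{\emptyset,\ \theta_1 \cap \theta_2,\ \theta_1,\ \theta_2,\ \theta_1 \cup \theta_2\}

% Bayesian theory: probability is assigned to the singletons only
p(\theta_1) + p(\theta_2) = 1

% DSmT: mass may also go to the conjunction and to the disjunction
m(\theta_1 \cap \theta_2) + m(\theta_1) + m(\theta_2) + m(\theta_1 \cup \theta_2) = 1

% Declaring (\theta_1, \theta_2) incompatible removes the conjunction
\theta_1 \cap \theta_2 \in C(\Theta)
\quad\Longrightarrow\quad
m(\theta_1) + m(\theta_2) + m(\theta_1 \cup \theta_2) = 1
```

The mass left on $\theta_1 \cup \theta_2$ expresses ignorance between the two hypotheses, which a single probability distribution cannot represent.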

4. IMAGES IN THE BAYESIAN NETWORK 

To include images in a Bayesian network, we associate a
variable $F_j$ with each imaging modality $j$. We have to define a
finite number of states for these variables. For that purpose, we
apply a principle similar to Content-Based Image Retrieval
(CBIR) [6]. CBIR involves 1) building a signature for each
image (i.e. extracting a feature vector summarizing its
numerical content), and 2) defining a distance measure between
two signatures. Thus, measuring the distance between two
images comes down to measuring the distance between two
signatures. Similarly, in a Bayesian network, we cluster
similar image signatures (according to the defined distance
measure) and associate a state of $F_j$ with each image cluster.

In previous studies, we proposed to compute a signature for
images from their wavelet transform (WT) [7]. These
signatures model the distribution of the WT coefficients in each
subband of the decomposition. The associated distance
measure $D$ [7] computes the divergence between these
distributions. We used this signature and distance measure to
cluster similar images.

Any clustering algorithm can be used, provided that the
distance measure between feature vectors can be specified. We
used FCM (Fuzzy C-Means) [8], one of the most common
algorithms, and replaced the Euclidean distance by $D$.
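
To show why FCM only needs a distance, the sketch below implements the standard FCM membership update from a precomputed distance matrix, so that any measure, including a divergence between signatures, can be plugged in. The function name `fcm_memberships` and the fuzzifier default `m=2.0` are assumptions for this example. The prototype update step is left out, since how cluster prototypes are recomputed under the divergence $D$ is not described here.

```python
import numpy as np


def fcm_memberships(dist, m=2.0, eps=1e-12):
    """Standard FCM membership update.

    dist: array of shape (n_samples, n_clusters) holding the distance
          between each signature and each cluster prototype, computed
          with whatever distance measure is in use.
    m:    fuzzifier, must be > 1 (2.0 is a common default, assumed here).
    Returns an array of the same shape whose rows sum to 1.
    """
    d = np.maximum(np.asarray(dist, dtype=float), eps)  # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


# Two signatures, two clusters: the second one is three times closer
# to the first prototype than to the second.
print(fcm_memberships([[1.0, 1.0], [1.0, 3.0]]))
```

Each row of the result can be read as the degree of membership of one image to each state of the corresponding variable.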

5. BAYESIAN NETWORK BASED RETRIEVAL 

Let $x_q$ be a case placed as a query. To assess the relevance
of each case $x$ in the database, we define a Bayesian network
with the following variables: a variable $F_j$, $j = 1..N$,
representing each feature of $x$ and a Boolean variable $Q$ = “the
query is satisfied” ($\bar{Q}$ = “the query is not satisfied”).

To build the network, we first learn the relationships between
the feature variables $F_j$, $j = 1..N$, from data [5]: we have
thus built a sub-network, independent of both $x_q$ and $x$ (see
figure 2 (a)).

$Q$ is then integrated in the network: $x_q$ specifies which
features should be found in the retrieved cases, so when the
$j$-th feature of $x_q$ is available, we connect the two nodes $Q$
and $F_j$ (see figure 2 (b)). If a node $F_j$ and $Q$ are connected,
we have to estimate, according to $x_q$, the associated conditional
probability matrix $P(F_j = f_{jk} \mid Q)$, where $f_{jk}$ denotes the
$k$-th possible state of $F_j$. To compute $P(F_j = f_{jk} \mid Q)$, we
first estimate $P(Q \mid F_j = f_{jk})$ by the procedure below and then
apply Bayes’ theorem.

To estimate $P(Q \mid F_j = f_{jk})$, we use the membership degree
of $x_q$ to each state $f_{jk}$ of $F_j$, noted $\alpha_{jk}(x_q)$. We
assume that the cases in the same class are predominantly in
a subset of the states of $F_j$. So, in order to estimate the
conditional probabilities, we use a correlation measure
$S_{jk_1k_2}$ between two feature states $f_{jk_1}$ and $f_{jk_2}$,
regarding the class of the cases at these states. To compute
$S_{jk_1k_2}$, we first compute the mean membership $D_{jk_1c}$
(resp. $D_{jk_2c}$) of cases in a given class $c$ to the state
$f_{jk_1}$ (resp. $f_{jk_2}$) (equation 1):

$$D_{jkc} = \beta \frac{\sum_x \delta(x, c)\, \alpha_{jk}(x)}{\sum_x \delta(x, c)}, \qquad \sum_{c=1}^{C} (D_{jkc})^2 = 1, \ \forall (j, k) \qquad (1)$$

where $\delta(x, c) = 1$ if $x$ is in class $c$, $\delta(x, c) = 0$ otherwise,
and $\beta$ is a normalizing factor. $S_{jk_1k_2}$ is given by equation 2:

$$S_{jk_1k_2} = \sum_{c=1}^{C} D_{jk_1c} D_{jk_2c} \qquad (2)$$

In the proposed model, we choose $P(Q \mid F_j = f_{jk})$
proportional to $\sum_{l} \alpha_{jl}(x_q) S_{jkl}$, where the sum runs
over the states $f_{jl}$ of $F_j$.

The different cases in the database are then processed
sequentially. To evaluate a case $x$, every available feature for $x$
is processed as evidence to infer the posterior probability $P(Q)$
(see figure 2 (b)). The cases are then ranked in decreasing
order of $P(Q)$.

(a) Query independent network layer
(b) Bayesian network based method
(c) Bayesian network + DSmT based method

Fig. 2. Evaluating a case $x$ by the two proposed methods.
Figure (a) describes the query independent network layer, learnt
from data. Figure (b) (resp. figure (c)) describes the method
presented in section 5 (resp. section 6). In this example,
features 6, 7, 14, 15, 16, 20, 22 and 23 are available for $x_q$.
Evidence nodes are grey. In figure (c), $\oplus$ represents the fusion
operator.

6. BAYESIAN NETWORK AND DSMT BASED
RETRIEVAL


To extend the previous Bayesian network based method to
the DSmT framework, we assign a belief mass not only to $Q$
and $\bar{Q}$, but also to $Q \cup \bar{Q}$ (not to $Q \cap \bar{Q}$ because
$Q$ and $\bar{Q}$ are incompatible hypotheses).

To compute the belief masses $m_j$ for a given feature $F_j$,
we define a test $T_j$ on the degree of match $dm_j(x, x_q)$
between $x$ and $x_q$. $dm_j(x, x_q)$ is defined as
$dm_j(x, x_q) = \sum_k P(Q \mid F_j = f_{jk})\, \alpha_{jk}(x)$ and $T_j$ is
defined as “$dm_j(x, x_q) > \tau_j$”, $0 < \tau_j < 1$. The sensitivity
(resp. the specificity) of test $T_j$ represents the degree of
confidence in a positive (resp. negative) answer to the test.
Whether the answer is positive or negative, $Q \cup \bar{Q}$ is
assigned the degree of uncertainty. The mass functions are then
assigned according to $T_j$. If $T_j$ is true:

$$m_j(Q) = P(T_j \mid x \text{ relevant for } x_q) \rightarrow \text{sensitivity}$$
$$m_j(Q \cup \bar{Q}) = 1 - m_j(Q) \qquad (3)$$
$$m_j(\bar{Q}) = 0$$

Otherwise:

$$m_j(\bar{Q}) = P(\bar{T}_j \mid x \text{ not relevant for } x_q) \rightarrow \text{specificity}$$
$$m_j(Q \cup \bar{Q}) = 1 - m_j(\bar{Q}) \qquad (4)$$
$$m_j(Q) = 0$$
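
The chain from memberships to belief masses can be sketched as follows for a single feature. This is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the proportionality constant of the relevance term is taken equal to 1, every class is assumed to contain at least one case, and the threshold, sensitivity and specificity are supplied by the caller.

```python
import numpy as np


def class_mean_memberships(alpha, labels, n_classes):
    """Equation 1 for one feature.

    alpha:  (n_cases, n_states) membership degrees of each case.
    labels: class index of each case, in range(n_classes).
    Returns D of shape (n_states, n_classes); each row is scaled so that
    its squared entries sum to 1 (the role of the normalizing factor).
    """
    alpha = np.asarray(alpha, dtype=float)
    labels = np.asarray(labels)
    D = np.stack(
        [alpha[labels == c].mean(axis=0) for c in range(n_classes)], axis=1
    )
    norms = np.linalg.norm(D, axis=1, keepdims=True)
    return D / np.where(norms > 0, norms, 1.0)


def state_correlation(D):
    """Equation 2: S[k1, k2] = sum over classes of D[k1, c] * D[k2, c]."""
    return D @ D.T


def query_relevance(alpha_q, S):
    """Relevance of each state given the query memberships.

    Assumption: the proportionality constant is 1.
    """
    return S @ np.asarray(alpha_q, dtype=float)


def degree_of_match(alpha_x, p_q_given_f):
    """Degree of match between a case and the query for this feature."""
    return float(np.dot(p_q_given_f, np.asarray(alpha_x, dtype=float)))


def belief_masses(dm, tau, sensitivity, specificity):
    """Equations 3 and 4; key 'U' stands for the union of Q and not-Q."""
    if dm > tau:
        return {"Q": sensitivity, "notQ": 0.0, "U": 1.0 - sensitivity}
    return {"Q": 0.0, "notQ": specificity, "U": 1.0 - specificity}


# Toy data (made up): four cases, two states, two classes.
D = class_mean_memberships([[1, 0], [1, 0], [0, 1], [0, 1]], [0, 0, 1, 1], 2)
p = query_relevance([0.8, 0.2], state_correlation(D))
print(belief_masses(degree_of_match([0.5, 0.5], p), 0.4, 0.7, 0.6))
```

Whatever the outcome of the test, part of the mass stays on the union, so a weakly discriminating feature contributes mostly uncertainty rather than a firm vote.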

We want to define $\tau_j$ so that $T_j$ is both sensitive and
specific. As $\tau_j$ increases, sensitivity decreases and
specificity increases. So, we set $\tau_j$ as the intersection of
the two curves “sensitivity according to $\tau_j$” and “specificity
according to $\tau_j$”. $\tau_j$ is searched by the bisection method:
for each value of $\tau_j$ evaluated, sensitivity and specificity
are estimated from the cases of the database.
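
The threshold search can be sketched as a bisection on the gap between the two empirical curves. The function names and the toy data are assumptions for this example; the sketch also assumes that the gap changes sign between 0 and 1 and that the sample contains both relevant and non-relevant cases.

```python
def sens_spec(dm, relevant, tau):
    """Empirical sensitivity and specificity of the test 'dm > tau'."""
    positive = [d > tau for d in dm]
    n_rel = sum(1 for r in relevant if r)
    n_irr = len(relevant) - n_rel
    tp = sum(1 for p, r in zip(positive, relevant) if p and r)
    tn = sum(1 for p, r in zip(positive, relevant) if not p and not r)
    return tp / n_rel, tn / n_irr


def crossing_threshold(dm, relevant, tol=1e-6):
    """Bisection on [0, 1] for the point where the two curves meet."""

    def gap(tau):
        se, sp = sens_spec(dm, relevant, tau)
        return se - sp

    lo, hi = 0.0, 1.0
    gap_lo = gap(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) * gap_lo > 0:  # same sign as at lo: crossing is further right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy degrees of match (made up): first four cases relevant, last four not.
dm = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2, 0.1, 0.6]
relevant = [True, True, True, True, False, False, False, False]
tau = crossing_threshold(dm, relevant)
print(tau, sens_spec(dm, relevant, tau + 1e-3))
```

Because the empirical curves are step functions, they may coincide over a whole interval; the bisection then returns one end of that interval.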

Finally, to evaluate a case $x$ with this model (see figure 2 (c)),
every available feature for $x$ is processed as evidence to
estimate $\alpha_{jk}(x)$ $\forall k$, $j = 1..N$. If the $j$-th feature of
$x_q$ is available, the degree of match $dm_j(x, x_q)$ is computed
and the belief masses are computed according to test $T_j$. The
sources available for $x_q$ are then fused with the PCR5 rule [2]
and the pignistic probability of $Q$, noted $betP(Q)$, is computed.
The cases are then ranked in decreasing order of $betP(Q)$.
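
The fusion and decision steps can be sketched for the three focal elements used here. This is a simplified sketch, not the authors' implementation: it implements the two-source PCR5 rule and chains it over the sources, which is an assumption, since PCR5 is not associative in general and the order of the sources can affect the result. Function names and the dictionary keys (`Q`, `notQ`, `U` for the union) are hypothetical.

```python
from functools import reduce


def pcr5_two_sources(m1, m2):
    """Two-source PCR5 with Q and not-Q declared incompatible."""
    # Conjunctive consensus on the non-conflicting intersections.
    q = m1["Q"] * m2["Q"] + m1["Q"] * m2["U"] + m1["U"] * m2["Q"]
    nq = m1["notQ"] * m2["notQ"] + m1["notQ"] * m2["U"] + m1["U"] * m2["notQ"]
    u = m1["U"] * m2["U"]
    # Each partial conflict (one source says Q, the other not-Q) is sent
    # back to Q and not-Q in proportion to the masses that caused it.
    for a, b in ((m1, m2), (m2, m1)):
        x, y = a["Q"], b["notQ"]
        if x + y > 0:
            q += x * x * y / (x + y)
            nq += y * y * x / (x + y)
    return {"Q": q, "notQ": nq, "U": u}


def fuse_all(masses):
    """Chain the two-source rule over all sources (simplifying assumption)."""
    return reduce(pcr5_two_sources, masses)


def pignistic_q(m):
    """Pignistic probability of Q: the mass on the union is split evenly."""
    return m["Q"] + 0.5 * m["U"]


# Two conflicting sources (made-up masses): one passed its test, one failed.
sources = [
    {"Q": 0.8, "notQ": 0.0, "U": 0.2},
    {"Q": 0.0, "notQ": 0.6, "U": 0.4},
]
fused = fuse_all(sources)
print(fused, pignistic_q(fused))
```

Ranking the database then amounts to computing this score for every case and sorting in decreasing order.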

7. RESULTS

The mean precision at five, i.e. the mean proportion of
relevant cases among the top five results, reaches 69.5% using
the Bayesian network based system, and 80.5% using the
Bayesian network and DSmT based system. As a comparison,
the mean precision at five obtained by CBIR (when cases are
made up of a single image) with the same image signatures is
46.1% [7]. To evaluate the contribution of the proposed
system for the retrieval of heterogeneous and incomplete cases,
the proposed method is compared to a linear combination of
heterogeneous distance functions, managing missing values
[9], which is the natural generalization of classic CBR to the
studied cases. Its extension to vectors containing images is
based on the distance between image signatures (see section
4). A mean precision at five of 52.3% was achieved by this
method. To evaluate the contribution of each attribute, we
give in figure 3 the sensitivity and specificity of each test $T_j$.
The method is robust regarding missing information: for
instance, the mean retrieval precision at five is 88.2% for
examples with 22 available attributes out of 23, and 71.5% for
examples with 12 available attributes.
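
For reference, the evaluation measure can be sketched as below. The function names are hypothetical, and what counts as a relevant result (for instance, a retrieved case sharing the severity level of the query) is an assumption left to the caller.

```python
def precision_at_k(ranked_is_relevant, k=5):
    """Fraction of relevant cases among the first k results of one query.

    ranked_is_relevant: relevance flags of the retrieved cases, best
    ranked first. The score is divided by k even if fewer than k
    results are supplied.
    """
    top = list(ranked_is_relevant)[:k]
    return sum(1 for r in top if r) / k


def mean_precision_at_k(all_rankings, k=5):
    """Average of precision_at_k over all queries."""
    scores = [precision_at_k(r, k) for r in all_rankings]
    return sum(scores) / len(scores)


# One query whose top five results contain three relevant cases.
print(precision_at_k([True, True, False, True, False, True]))
```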



Fig. 3. Influence of each descriptor. The sensitivity followed
by the specificity of each test $T_j$ is given for each descriptor
(the same letters as in figure 1 are used to denote image
modalities).


8. DISCUSSION AND CONCLUSION 

In this article, we introduce a method to include image series
and their numerical signatures, with contextual information,
in CBR systems. In particular, a way to include image
signatures in a Bayesian network was proposed. Two retrieval
systems, based on the same principle, were proposed: a Bayesian
network is used to model the relationships between case
descriptors and thus handle missing information, and relevance
information coming from each descriptor is fused by either
Bayesian fusion or DSmT. In this system, DSmT shows
its superiority over Bayesian fusion. Bayesian networks are
however efficient for managing missing information. On this
database, the method outperforms our first CBIR algorithm by
a factor of 1.75 in precision (80.5% as opposed to 46.1%).
This stands to reason since an image alone is generally not
sufficient for experts to correctly diagnose the disease severity
level of a patient. However, figure 3 shows that each single
image is a relevant attribute. Besides, this non-linear retrieval
method is 1.54 times as precise (80.5% as opposed to 52.3%)
as a simple linear combination of heterogeneous distances
on the DR database. The proposed framework is also
interesting for being generic: any multimedia database may be
processed so long as a procedure to cluster cases is provided
for each new modality (sound, video, etc).